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ABSTRACT 


We  typically  think  of  fitting  data  with  an  approximating  curve  in  the  linear 
least  squares  sense,  where  the  sum  of  the  residuals  in  the  vertical,  or  y,  direction  is 
minimized.  The  problem  addressed  here  is  to  fit  a  Bezier  curve  to  an  ordered  set 
of  data  in  the  total  least  squares  sense,  where  the  sum  of  the  residuals  in  both  the 
horizontal  and  vertical  directions  is  minimized.  More  exact:  given  an  ordered  set  of  m 
data  points  d,,  i  =  1, 2, . . . ,  m  find  a  set  of  control  points  bj,  i  =  0, 1, . . . ,  n  where  n  is 
the  order  of  the  Bezier  curve,  and  a  vector  t  of  nodes,  0  <  ti  <  t2  <  •  •  •  <  tm  <  1  that 
minimize  ||  B( t)  P  —  D  ||f-  The  matrix  D  contains  the  data  points,  the  matrix  P 
contains  the  control  points,  and  the  matrix  B( t)  is  a  Bernstein  matrix.  The  algorithm 
to  accomplish  this  is  explained  in  detail  and  makes  extensive  use  of  the  linear  algebra 
representation  of  Bezier  curves. 
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I. 


INTRODUCTION 


Notation  used  in  this  paper  is  as  follows:  matrices  are  slanted  upper  case 
letters,  vectors  are  lower  case  bold  letters,  and  scalars  are  slanted  lower  case  letters. 
Also,  vectors  are  column  arrays  unless  otherwise  indicated.  For  ease  of  discussion,  a 
point  and  a  vector  are  equivalent. 


A.  BERNSTEIN  POLYNOMIALS 


Bernstein  polynomials  have  the  following  general  form: 

^n\  (t  —  aY(b  —  t)r 


Bm  = 


( b  —  a)n 


",  *  —  0,1,..., n 


(1.1) 


where  t  6  [a,  6]. 

We  will  only  consider  the  case  where  a  =  0  and  6=1.  Equation  (1.1)  now 
takes  on  the  form; 


B”(*)=  ("Wfl- -*)**'>  *  =  0,1,...,!*  (1.2) 

which  is  a  scalar  for  a  particular  t  £  [0,1].  From  Equation  (1.2)  note  that  B°(t)  =  1. 
Also,  we  define  Bf^t)  =  0  if  j  <  0  or  j  >  n. 

The  set  of  Bernstein  polynomials  of  degree  n  form  a  basis  for  Vn,  the  space  of 
polynomials  of  degree  n  or  less.  The  linear  transformation  from  power  basis  coeffi¬ 
cients  to  Bernstein  basis  coefficients  is  explained  in  detail  in  [Ref.  1].  For  a  complete 
discussion  on  the  properties  of  Bernstein  polynomials  see  [Ref.  2]. 


B.  BEZIER  CURVES 

Bezier  curves  are  named  after  P.  Bezier  and  are  used  extensively  in  computer 
aided  geometric  design.  Before  presenting  the  general  form  for  a  degree  n  Bezier 
curve,  let  us  look  at  an  example. 

Consider  two  points  on  the  x-axis  given  by  bp  =  (2,0)  and  bf  =  (4,0),  and 
suppose  that  we  want  to  describe  a  degree  1  curve  between  these  two  points.  A 
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parametric  representation  of  this  curve  is; 

x{t )  =  2  +  2 t,  y(t)  =  0,  t  €  [0 , 1] 

Note  that  since  Bj(t)  for  i  =  0, 1  is  a  basis  for  V\,  we  can  write  x(t)  and  y(t)  in  terms 
of  Bernstein  polynomials.  Using  matrices,  we  have; 

[*(*),  y(*)]  =  [2  +  2i,0] 

=  [2(1  —  t)  +  4t ,  0(1  —  t )  +  Of  ] 

2  0 
4  0 
bT  * 

=  Bl(t)  B't(t)  °  (1.3) 

L  . 

The  parametric  representation  of  a  curve  in  the  form  of  Equation  (1.3)  is  of  central 
importance  in  this  paper.  Also,  all  examples  in  this  paper  are  in  2-dimensions,  but 
it  is  understood  that  this  representation  holds  for  higher  dimensional  space  as  well. 
In  general,  a  Bezier  curve  of  degree  n,  denoted  bo(f),  can  be  written  as 

W«)T  =  i+rmbf  (1.4) 

i—O 

=  +  BTWb?1  +  ■  •  •  + 

where  bp(t)  is  a  point  on  the  curve  for  a  particular  t  €  [0 , 1].  The  points  b,,  i  =  0, n 
are  called  control  points.  We  see  that  a  point  on  a  Bezier  curve  is  a  weighted  sum  of 
control  points,  where  the  weights  are  Bernstein  polynomials  evaluated  at  a  particular 
value  of  t. 

Let  us  look  at  an  example  of  a  cubic  Bezier  curve  before  discussing  properties 
of  these  curves.  Figure  (1)  on  page  9  shows  four  control  points  and  a  curve  starting 
at  control  point  b0  and  ending  at  control  point  b3.  Note  that  we  need  n  +  1  control 
points  for  a  degree  n  curve.  Also  note  that,  in  this  example,  the  curve  does  not  pass 
through  bi  or  b2- 


=  Bi(t)  Bl(t) 
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Figure  (2)  on  page  9  shows  how  the  curve  changes  when  the  positions  of  control 
points  bi  and  b2  are  interchanged.  This  is  why  Bezier  curves  are  so  widely  used  in 
computed  aided  geometric  design;  by  changing  the  position  of  one  or  more  control 
points  the  user  can  easily  change  the  shape  of  the  curve. 

C.  PROPERTIES  OF  BEZIER  CURVES 

The  following  properties  are  basic  to  understanding  Bezier  curves  and  provide 
the  necessary  background  for  discussion  later  in  this  paper. 

1.  Endpoint  Interpolation 

Bezier  curves  interpolate  between  the  end  control  points  bo  and  bn.  We  saw 
in  Figure  (1)  that  the  Bezier  curve  had  bo  as  one  endpoint  and  b3  as  the  other.  This 
is  always  the  case  and  is  shown  using  Equation  (1.4)  with  t  =  0  and  t  =  1.  Because 
Bf( 0)  =  0  except  for  i  =  0,  and  B*(  1)  =  0  except  for  i  =  n,  we  have; 

(bS(0))T  =  ix(  0)bf  =  bj 

and 

i=0 

Though  Bezier  curves  are  guaranteed  to  begin  at  bo  and  end  at  bn,  it  is  not  guaranteed 
that  they  pass  through  any  of  the  intermediate  control  points  bj,  b2, . . . ,  bn_i. 

2.  Affine  Invariance 

An  affine  transformation  is  of  the  form  3>(p(f))  =  Ap(t)  +  c.  Examples  of 
affine  transformations  are  scaling,  rotation,  and  reflection.  The  equation  describing 
Bezier  curves  is  affine  invariant  in  that  given 

(bS(i)f  =  tm)  bf 

i= 0 

under  the  transformation  $  we  will  have 

*((bS(i))T)=EBr(*mbf) 

t=0 
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In  other  words,  given  an  affine  transformation  $,  performing  the  following  two  steps 
produces  equivalent  results. 

1.  Evaluate  Equation  (1.4)  for  a  given  set  of  control  points  and  a  value  t  to 
obtain  (b£(t))T,  and  then  apply  $  to  (b£(t))T. 

2.  Apply  $  to  each  of  the  control  points  and  then  evaluate  Equation  (1.4) 
using  the  transformed  control  points  and  the  same  value  £  as  in  1. 

For  example,  suppose  we  want  to  rotate  a  given  Bezier  curve.  We  can  either  transform 
the  relatively  few  control  points  or  transform  all  the  points  of  the  curve.  In  general, 
transforming  the  control  points  and  re-plotting  the  curve  costs  much  less.  Other 
properties  of  the  Bezier  curve  are  discussed  in  detail  in  [Ref.  2]. 


D.  MATRIX  REPRESENTATION  OF  BEZIER  CURVES 

In  this  section  we  represent  Bezier  curves  in  linear  algebra  form.  This  paper 
makes  extensive  use  of  this  representation  for  Bezier  curves  to  simplify  their  manip¬ 
ulation. 


1.  The  Bernstein  Matrix 

Equation  (1.4)  showed  that  a  Bezier  curve  is  described  by 


« (<))T  =  £B,n(i)b?,  <€[0,1] 

i=0 

This  is  equivalent  to  the  linear  algebra  form: 


(b;(())T  =  «(<)  •  •  •  B„"(<)1 


=  B(t)P 


(1.5) 


where  we  will  refer  to  B(t )  as  a  Bernstein  matrix.  A  Bernstein  matrix  is  a  generalized 
Vandermonde  matrix  which,  for  a  vector  t  =  [tl5 12,  ■  ■ . ,  tm]T ,  f,  €  [0,1]  and  a  given 
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degree  n,  has  the  form; 


BS(h)  Bf(  <i)  •••  B*(U) 

BS{h)  BJ(h)  •••  BZ(tt) 

B(i)  = 

•••  B"(tm) 

Therefore,  we  see  that  for  t  G  'Rm  and  a  given  degree  n  we  have  B(t)  G  7£mx(n+1). 

In  2-dimension,  the  matrix  P  contains  the  x  and  y  coordinates  of  the  control 
points  and  has  the  following  form: 

xo  y0 
P=  i  ! 

Vn 

For  example,  if  instead  of  evaluating  Equation  (1.5)  for  a  particular  t  we  are 
interested  in  getting  m  =  3  points  on  a  Bezier  curve  of  degree  n  =  3  we  would 
evaluate; 

(bS(t))r  =  B(  t)  P 

where  B{ t)  6  7i3x4  and  P  €  P4x2.  Therefore,  (b^t))7  G  P3x2  and  is  a  matrix  of 
points  on  the  curve. 

As  another  example,  let  n  —  3  and  only  consider  a  particular  t.  We  will  expand 
the  elements  of  B(t)  to  further  demonstrate  the  inherent  linear  algebra  form  of  Bezier 
curve  representation.  Note  that  we  can  write 

B3(t)  -t3  +  3t2  -  3t  +  1 

Bf(t)  3 13  -  6t2  +  3 1 

B(tf  = 

B3{t)  -St3  +  St2 

B3(t)  t3 
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and,  therefore,  we  see  that 


-1 

3 

-3 

1 


3 

-6 

3 

0 


-3  1 
3  0 
0  0 
0  0 


’  t 3 " 

t2 

t 

1 

=  Mt  Tt 


where  T  £  7Zlx4  for  the  array.  Equivalently,  we  have; 

B{t)  =  TM 


(1.6) 


If  we  now  consider  a  vector  t  £  TZ"1,  then  instead  of  T  £  7£lx4  we  would  have 
T  £  7£TOX4  .  For  a  given  matrix  of  control  points  we  get  the  following  equation  which 
describes  m  points  on  a  degree  3  Bezier  curve. 


(b?(t)) 


T  _ 


t\  t\  1 


t3  t2  t  1 

Lm  Lrn  i 


-1 

3 

-3 

1 


3  -3  1 

-6  3  0 

3  0  0 

0  0  0 


Xo 

yo 

X\ 

yi 

X2 

2/2 

.  XZ 

y 3  _ 

=  TMP 
=  B(t)P 

2.  Slope  of  a  Bezier  Curve 

Now  that  we  have  an  equation  for  a  degree  n  Bezier  curve,  we  will  derive  the 
equation  for  the  first  derivative  at  a  particular  t,  which  in  turn  gives  the  slope  at  a 
point  on  the  curve.  Since  Bezier  curves  are  parametric  in  t,  we  want  to  solve 


dt 


d(K(t))T  =  J,±B?(t)bJ 

ai  t=0 

=  tjB?  (t)bj 

t'=0  ai 


From  [Ref.  2:  page  46]  we  have 

d 


dt 
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and,  therefore, 


dt 


*= 0 


=  (n[0  -  BJ-’MDbJ  +  (n[B0“-‘(<)  -  B?-'(t)])bf  +  . . . 
+(»[ B'nZ\{t)  -  0])bj 


After  combining  like  terms  of  n  Bf  1  (t)  we  have,  finally 
d 


dt 


(K(t))T'  =  nBr1mb^-bT0)  +  nBr1(t)(bT2-bT1)  + 
+nB:zl(t)(bTn-bl_1 ) 

=  nJ2B^-\t)(bJ+1-bJ) 


2=0 

=  -Esr  wAbf 

i= 0 


(1.7) 


where  A  is  the  forward  difference  operator. 

Equation  (1.7)  lends  itself  to  representation  in  linear  algebra  form  as  follows; 

bo 

bf 


d 

dt 


(b;(<)f  =  n  [BJ-'W  Br'W  •••  A 


>T 

Jn 


=  nB(t)  A  P 

where  A  €  77nAn+1)  and  has  the  form 

-1  1  0 

0  -1  1 


0 

0 


0 

0 


0  .  0  -1  1 

For  example,  consider  the  case  where  n  =  3.  Since  we  have  that 


m))T = 


Bl(1)  ' 

1 

i — ( 

+ 

CN 

1 

<N 

1 

-2 

1 

t- 2 

B?« 

= 

-2 t2  +  2 1 

= 

-2 

2 

0 

t 

X - 

to 

to  to 

C-4- 

1 _ 

t2 

1 

0 

0 

1 
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we  can  write 


=  *[t2  t  l] 


1  -2  1 

-2  2  0 

1  0  0 


-1  1  0 

0  -1  1 

0  0-1 


We  can  obtain  the  first  derivative  at  m  points  along  a  Bezier  curve  by  evaluating 
Equation  (1.7)  for  a  given  vector  t  €  7£m  and  €  [0,1].  For  example,  when  n  =  3 
we  have; 


CN  r-t 

1 _ 

ti 

1 

r 

SH*. 

ow 

rt* 

II 

CO 

t2 

l2 

CN  .  ,  , 

*+-i 

1 

t 2 
Lm 

l 

- 

1 

0 

0 


-1  1  0 

0  -1  1 

0  0-1 


There  are  similar  equations  for  higher  derivatives,  for  these  see  [Ref.  2]. 
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Figure  1.  Cubic  Bezier  Curve  with  Control  Points 
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Figure  2.  Cubic  Bezier  Curve  with  Reordered  Control  Points 
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II 


PROBLEM  STATEMENT 


This  chapter  presents  the  problem  of  fitting  a  given  ordered  set  of  data  with 
a  Bezier  curve  in  the  total  least  squares  sense.  We  will  see  how  the  linear  algebra 
representation  of  Bezier  curves  lends  itself  to  solving  this  problem.  We  refer  to  the 
point  along  a  Bezier  curve  determined  by  a  particular  node  U  as  the  point  t*.  So, 
from  Equation  (1.4)  we  have 

Ti  =  b*(ti) 

A.  PROBLEM  STATEMENT 

The  problem  to  solve  is  stated  as  follows:  given  an  ordered  set  of  m  data 
points  d,-,  i  =  1, 2, . . . ,  m  find  a  set  of  control  points  b«,  i  =  0, 1, . . . ,  n,  where  n  is 
the  order  of  the  Bezier  curve,  and  a  vector  t  of  nodes,  0  <  t\  <  t2  <  •  •  •  <  tm  <  1 
that  minimize 

\\B(t)P  -  D\\f  (II.1) 

where  the  matrix  D  contains  the  data  points.  Though  neither  the  matrix  P*  nor 
the  vector  t*  that  minimize  Equation  (II.  1)  are  unique,  and  though  in  practice  the 
resulting  value  of  Equation  (II. 1)  is  determined  in  part  by  the  stopping  criteria  of 
the  algorithm,  for  ease  of  reference  we  will  equate  minimizing  Equation  (I-I.l)  with 
determining  {P*,t*}.  Also,  we  will  only  consider  the  case  where  n  <  m  since  for 
n  >  m  we  could  produce  a  curve  which  passes  through  all  the  data  points  and  this  is 
uninteresting  in  the  context  of  this  paper. 

Notice  that  we  are  minimizing  the  Frobenius  norm  of  the  residual  in  Equa¬ 
tion  (II. 1).  For  A  €  TZmXn,  this  norm  is  given  by 

(m  n  \  2 

;=i j- i  / 
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Consider  the  case  m  =  3  and  n  =  1.  The  residual  in  Equation  (II. 1)  is  then 


BUh)  B‘(U) 

B'0(h)  B'Ah) 
B'o(tz)  B'(is) 


However,  minimizing  the  Frobenius  norm  is  the  same  as  minimizing  the  2  norm  of 


Bit*.)  «,'((,) 

Bh(t 2)  B\{t2) 
BlM  B\{h) 
0  0 
0  0 
0  0 


0  0 

d\jX 

0  0 

bo,x 

0  0 

bi,x 

^3,1 

mu)  b\(u) 

t>0,y 

d\,y 

BHU)  Bl(U) 

^2,3 y 

B'M  B>((,). 

.  ^3-y  . 

(n.2) 


where,  for  example,  b0,y  denotes  the  y  component  of  the  vector  b0.  We  will  call 
Equation  (II. 2)  the  uncoupled  form  of  the  residual  in  Equation  (II. 1). 


B.  THE  SEPARABLE  LEAST  SQUARES  PROBLEM 

Minimizing  Equation  (II. 1)  is  a  nonlinear  least  squares  problem  because  it  is 
nonlinear  in  t.  Minimizing  Equation  (II. 1)  is  also  separable  because  we  can  separate 
out  from  the  original  nonlinear  problem  a  linear  least  squares  parametric  functional 
problem  for  the  matrix  P.  In  particular,  for  a  given  vector  t,  the  minimizing  matrix 
P  of  the  residual  in  Equation  (II.  1)  satisfies 

P  =  B+(t)D 

where  B+{ t)  is  the  Moore-Penrose  generalized  inverse,  or  pseudo-inverse,  of  Z?(t). 

Note,  since  we  have  that  P  =  B+(t)D  for  a  given  t  that  we  are  able  to  separate 
the  unknown  matrix  P  from  Equation  (II.l).  Therefore,  the  optimal  vector  t  is  found 
by  minimizing  the  variable  projection  functional 

||  B(t)B+(t)D  -  D  ||f 
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Further,  consider  the  nonlinear  least  squares  problem 


’  Bo2(i.) 

BK*i) 

B(t)P  = 

_  Bl(U) 

Bf(U) 

B&U)  _ 

_br  _ 

— 

_dI . 

where  the  matrix  P  €  PL4x2  and  the  vector  t  £  7£4  are  unknown  and  n  =  2.  Note  that 
if  the  vector  t  is  given,  then  our  least  squares  problem  becomes  linear.  If,  instead,  we 
are  given  a  matrix  P,  then  Equation  (II. 3)  is  a  nonlinear  least  squares  problem  for 
the  vector  t  only. 

C.  SHORTEST  DISTANCE  TO  A  CURVE 

Given  an  ordered  set  of  data,  minimizing  Equation  (II.  1)  determines  the  points 
r  which  are  nearest  to  their  associated  data  points  for  a  particular  control  point 
matrix  P.  We  will  refer  to  these  particular  points  as  the  nearest  points.  For  example, 
let  TijX  and  r^y  represent  the  x  and  y  values  of  the  point  r,-  for  a  particular  matrix  P. 
We  saw  that  the  nodes  and  control  points  that  minimize  Equation  (II.  1)  also  minimize 
the  2  norm  of  Equation  (II.2).  This  is  equivalent  to  minimizing  the  objective  function ; 

(l~l,x  dl,x)  d~  '  '  '  d"  dm^x)  d"  (^l,y  ^l,y)  d"  ■  ■  *  d"  (II. 4) 

For  a  particular  matrix  P,  a  change  in  the  node  U  only  affects  the  point 
Tj.  Therefore,  Equation  (II. 4)  can  be  broken  down  into  m  independent  objective 
functions.  For  example,  the  node  fi  only  affects  the  following  terms  of  Equation  (II.4) 

(T’l,a:  dl,x)  d"  (Wy  ^liv)  (II. 5) 

Minimizing  Equation  (II. 5)  is  equivalent  to  determining  the  nearest  point  for  data 
point  (dj)X ,  dity  ).  Proceeding  like  this  for  each  node,  we  can  determine  each  nearest 
point  and  minimize  Equation  (II.  1)  for  a  particular  matrix  P.  So  we  see  that  with  a 
given  matrix  P  the  nonlinear  least  squares  problem  for  the  vector  t  is  equivalent  to 
solving  the  nearest  point  problem. 
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1.  A  Necessary  Condition 


Treating  t\  as  a  variable,  Equation  (II. 5)  is  minimized  by  finding  the  stationary 
point  of 


—  (rl,Z  ^l,x)  "b  dlty) 


The  stationary  point  is  such  that 


^  9{t  1)  —  2(tx}X  d\tX) ^  T-[tX  -f  2(tii3/  d] u  y 


In  matrix  notation  this  becomes 

i~i,x  d\tX 

Tl,y  ~  diiV 

So,  we  see  that  the  point  rt-  which  minimizes  Equation  (II. 5)  is  perpendicular  to 
the  tangent  vector  at  that  same  point.  In  general,  this  is  a  necessary  condition  for 
minimizing  Equation  (II. 5). 

Figure  (3)  on  page  18  shows  a  given  ordered  set  of  data  and  the  degree  2 
Bezier  curve  produced  from  a  given  set  of  control  points.  Figure  (4)  on  page  18  shows 
the  points  t  gotten  by  solving  the  nonlinear  least  squares  problem  for  the  vector  t. 
Notice  that  each  rt-  satisfies  the  necessary  condition  except  for  the  point  r4.  Because 
our  parameter  values  are  constrained  within  [0, 1],  <4  =  1  is  the  best  we  can  do. 

Now,  consider  again  the  problem  of  minimizing  Equation  (II. 4)  except  we  are 
given  an  initial  set  of  nodes.  Figure  (5)  on  page  19  shows  the  same  data  set  as  used 
in  Figure  (3)  and  the  node  associated  with  each  data  point.  The  nodes  were  chosen 
uniformly  spaced  on  the  interval  [0 , 1]  and  are  given  by 


_d_  _  _d_ 

(ft,  1.*  (ft,  b» 


0 

.33 

.67 

1 


The  degree  of  the  Bezier  curve  we  want  to  fit  to  the  data  is  again  two.  After  solving 
the  linear  least  squares  parameter  functional  problem  for  the  control  points,  we  can 
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plot  the  Bezier  curve.  Figure  (6)  on  page  19  shows  the  resulting  Bezier  curve  along 
with  the  points  r.  Note  that  point  r3  does  not  minimize  Equation  (II. 5)  or  satisfy  the 
necessary  condition.  What  the  least  squares  solution  for  the  control  points  produced 
was  the  best  answer  possible  given  the  initial  set  of  nodes  and  desired  degree  of  the 
Bezier  curve. 


D.  ORDERED  DATA 

In  the  problem  statement  at  the  beginning  of  the  chapter,  we  are  given  an 
ordered  set  of  m  data  points.  We  will  order  the  data  with  respect  to  t  because  Bezier 
curves  are  parametric  in  t.  So,  tj  >  tj  implies  that  the  data  point  associated  with  tj 
is  ordered  after  the  data  point  associated  with  U.  Therefore,  the  problem  of  ordering 
a  set  of  data  with  respect  to  t  becomes  one  of  determining  an  initial  set  of  nodes. 

Consider  a  set  of  m  data  points  and  that  we  want  to  determine  a  set  of  control 
points  which  will  produce  an  approximating  Bezier  curve.  We  need  an  initial  set  of 
nodes  in  order  to  solve  the  least  squares  problem  for  the  matrix  P,  and  we  want  the 
initial  set  of  points  r  in  the  neighborhood  of  the  nearest  points.  The  reason  for  this 
last  condition  on  the  initial  set  of  points  t  is  explained  in  Chapter  III.  Since  the 
elements  of  t  must  be  contained  in  [0,1],  we  might  use  either  the  uniform  method 

i  - 1  .  , 

U  = - -,  i  =  l,...,m 

m  —  1 


or  the  chord  length  method 


t{  —  -J- 


d;  —  dj_i 


TZ,  II  di  -  di-!  ||,’ 


i  =  2,3, . . .  ,m 


where  we  define  t\  =  0. 

The  main  problem  with  the  uniform  method  is  that  it  does  not  take  into 
account  nonuniform  distribution  of  the  data.  The  main  problem  with  the  chord 
length  method  is  that  it  is  not  invariant  under  all  affine  transformations  [Ref.  3].  For 
example,  consider  a  set  of  data  and  that  the  user  wants  to  scale  one  of  the  components 
by  a  constant  while  using  the  same  initial  set  of  nodes  in  the  new  scaling.  Because 
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the  chord  length  method  is  not  invariant  under  scaling,  the  resulting  Bezier  curves 
would  not  have  the  same  relationship  to  the  data. 

The  affine  invariant  chord  method  as  described  in  [Ref.  3]  takes  into  account 
nonuniform  spacing  of  data  and  is  based  on  the  following  metric 


I  A+i  -  Di^  =  [x«+i  ~  y«+ 1  “  Vi]  A  [xt+]  -  Xi,  t/i+1  -  yi\T 


=  [xi+i-Xi,yi+i-yi] 


rx 

9 

&XY 


_ gyy 

9 


yi+i  ^  y% 


(II  7) 


where  Di  and  D,+i  are  successive  data  points,  <j\  is  the  sample  variance  in  the  y 
components  of  the  entire  data  set,  a\  is  the  sample  variance  in  the  x  components  of 
the  entire  data  set,  ctxy  is  the  sample  covariance  in  the  x  and  y  components  of  the 
entire  data  set,  and 
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A-1 

aXaY  -  (°XY)2 


The  matrix  A  is  the  inverse  of  the  covariance  matrix  used  in  the  bivariate  normal 
distribution  function. 

The  term  metric  means  a  way  to  measure  distance.  Normally,  we  think  of  the 
Euclidean  metric 


||  A+i  -  Di  ||2  =  [xi+1  -  x,-,  yi+ 1  -  yi]  I  [xi+1  -  Xi ,  yi+1  -  y,-]T 
=  (xt+1  -  X,)2  +  (t/i+i  -  yi)2 

Note  that  in  the  case  of  the  Euclidean  metric  A  =  I.  The  reason  is  that  the  Euclidean 
metric  assumes  no  scaling  or  correlation  between  the  data  points. 

An  improvement  on  the  affine  invariant  chord  method  is  the  affine  invariant 
angle  method.  This  method  combines  the  metric  in  Equation  (II. 7)  and  the  bending 
of  the  data.  Let  three  noncollinear  data  points  be  the  vertices  of  a  triangle.  The 
bending  of  the  data  is  how  much  of  a  turn  is  made  when  going  from  one  side  of  the 
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triangle  to  another  at  a  vertex.  For  details  of  the  construction  of  this  method  see 
[Ref.  4].  The  primary  algorithm  of  this  paper  to  solve  the  problem  stated  at  the 
beginning  of  the  chapter  uses  the  affine  invariant  angle  method  to  obtain  the  initial 
set  of  nodes. 
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Figure  3.  Data  Points  and  Fitted  Curve 
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Figure  4.  Solution  for  the  Nearest  Points 
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1  2  3  4  5 

Figure  6.  Solution  to  Linear  Least  Squares  Problem 
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III.  BASIC  ALGORITHM 


This  chapter  presents  the  basic  algorithm  used  in  this  paper  to  determine 
and  also  presents  three  methods  used  to  solve  the  nearest  point  problem. 
Finally,  the  primary  algorithm  to  determine  {P*,t*}  is  presented. 

A.  BASIC  ALGORITHM 

The  basic  algorithm  used  in  this  paper  to  determine  makes  use  of 

the  separability  of  minimizing  Equation  (II.  1)  and  follows  an  alternating  projection 
approach.  The  basic  steps  are  as  follows: 

1.  Determine  the  initial  set  of  nodes. 

2.  Solve  the  linear  least  squares  parametric  functional  problem  for  the  control 
points. 

3.  Solve  the  nonlinear  least  squares  problem  for  the  nearest  points. 

4.  Repeat  steps  two  and  three  until  the  algorithm  reaches  the  stopping  criteria. 

The  difference  between  the  three  MATLAB  functions  used  in  researching  this  paper 
is  the  method  each  uses  to  accomplish  step  three  of  the  basic  algorithm. 

1.  Approaching 

Let  us  look  at  the  idea  behind  the  basic  algorithm  as  a  solution  technique  for 
determining  {P*,t*}.  Given  an  ordered  set  of  data  and  an  initial  vector  of  nodes  ti, 
after  solving  the  least  squares  problem  for  the  matrix  P\  we  would  have 

||  B(t1)P1  -  D  ||F=  <jj  (III.l) 

Note  that,  in  general,  the  Bezier  curve  will  not  pass  exactly  through  all  the  data 
points. 

As  we  saw  in  Figure  (6)  on  page  19  the  initial  set  of  points  r  will  most  likely 
not  be  the  nearest  points,  so  some  improvement  to  ti  is  possible.  Solving  the  nearest 
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point  problem  for  t2  gives 


||  B(t2)Pi  -  D  ||f=  cr2  <  <7i 

Now  we  have  an  improved  set  of  nodes  but  we  have  not  improved  the  initial 
Bezier  curve  as  described  by  P\.  If  there  is  a  better  fitting  Bezier  curve  using  the 
vector  of  nodes  t2  then  solving  the  least  squares  problem  for  P2  will  result  in 

II  B(t2)P2  -  D  ||F=  03  <  02 

In  this  manner,  we  approach  {P*,t*}.  The  above  argument  is  not  a  proof  of 
convergence. 

B.  THE  NEARESTPOINT  METHOD 

Given  a  matrix  P,  the  NearestPoint  method  is  a  code  found  in  [Ref.  5]  which 
determines  the  nearest  point  associated  with  each  data  point  d,  by  numerically  solving 
for  the  roots  of 

Ki)p  -  dH  ■  5;B(i')P = 0 

For  example,  for  a  degree  3  Bezier  curve,  the  left-hand  side  of  the  above  equation 
is  a  degree  5  polynomial.  A  subroutine  called  FindRoots  returns  the  real  roots  of 
the  resulting  polynomial  which,  after  ensuring  they  are  within  the  interval  [0,  1],  are 
used  to  determine  points  on  the  curve.  The  distance  from  the  data  point  d,  to  each 
determined  point  on  the  curve  along  with  the  distance  from  the  data  point  to  each 
endpoint  of  the  curve  is  compared,  and  the  shortest  distance  indicates  the  parameter 
value  of  the  nearest  point.  Because  this  is  a  numerical  method,  there  is  some  degree 
of  error  in  the  solution. 

There  are  two  notable  differences  between  this  method  and  the  two  other 
methods  used  to  accomplish  step  three  of  the  basic  algorithm.  First,  whereas  the 
NearestPoint  code  considers  each  nearest  point  separately,  the  linear  algebra  repre¬ 
sentation  of  Bezier  curves  is  used  by  the  two  other  methods  to  consider  all  the  nearest 
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points  at  once.  Second,  note  again  that  the  basic  algorithm  is  an  alternating  pro¬ 
jection  approach  where  a  new  curve  is  produced  with  each  new  matrix  P.  In  other 
words,  the  problem  changes  each  time  step  two  of  the  basic  algorithm  is  performed. 
So,  when  accomplishing  step  three  of  the  basic  algorithm  the  two  other  methods  do 
not  continue  to  iterate  to  reach  a  numerical  solution  to  the  nearest  point  problem. 
Instead,  the  two  other  methods  only  move  one  step  in  the  direction  of  the  nearest 
points. 


C.  THE  GRADIENT  METHOD 

The  gradient  method  resulted  from  seeing  the  nonlinear  least  squares  problem 
for  the  nearest  points  as  an  optimization  problem  requiring  a  gradient  search  tech¬ 
nique  [Ref.  6:  220].  From  an  initial  set  of  nodes,  we  want  to  change  each  value  so 
that  the  points  r  approach  the  nearest  points.  However,  instead  of  holding  to  the 
formal  gradient  search  method  we  proceed  as  follows. 

Recall  that  in  Chapter  II  we  showed  that  the  nearest  point  on  a  Bezier  curve 
from  a  given  data  point  d,  will  minimize 


Tf  -  d,  Hr 


(III.2) 


and  satisfy  the  necessary  condition 


WJ i-* 


t~i,x  d\tX 

Tl,y  —  dliV 


=  0 


(III.3) 


Therefore,  if  we  have  an  initial  point  r,-  in  the  neighborhood  of  the  nearest  point, 
then  by  finding  the  point  t,  which  satisfies  Equation  (III.3)  we  find  the  nearest  point. 
Recall  that  the  cosine  of  the  angle  between  two  vectors  a  and  b  is 


cos  6  = 


a  •  b 

II  a  llll  b  || 


Let  a  =  and  b  =  rt  —  d,.  Given  a  point  rt-,  which  is  in  the  neighborhood  of  the 
nearest  point,  we  want  to  change  the  node  ti  so  that  a  •  b  approaches  zero.  Note  that 
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the  closer  we  get  to  satisfying  Equation  (III. 3)  the  smaller  the  magnitude  of  a  •  b. 
Therefore,  we  use  a  •  b  to  not  only  tell  us  what  direction  to  move  the  node  tj  but  also, 
it  was  initially  thought,  to  give  sufficient  indication  of  how  far  to  move. 

Consider  the  following  example.  Figure  (7)  on  page  30  shows  four  ordered 
data  points  and  a  Bezier  curve  fitted  to  the  data  points.  Since  6  <  |  we  have  that 
a  •  b  >  0.  This  tells  us  to  move  point  r3  to  the  left,  which  corresponds  to  decreasing 
the  value  t3.  We  change  t3  with 

f3  =  f3  —  At3  (III.4) 

where  A t3  is  a  scalar  multiple  of  a  •  b.  This  iterative  process  is  likened  to  bracketing 
the  nearest  point. 

1.  Results  of  the  Gradient  Method 

The  first  problem  with  the  method  as  presented  is  that  Equation  (III. 3)  is  not 
a  sufficient  condition  for  minimizing  Equation  (III. 2).  Starting  out  with  an  initial  set 
of  points  t  in  the  neighborhood  of  the  nearest  points  may  overcome  this  problem  in 
most  cases,  but  it  would  still  require  an  added  check  in  any  algorithm. 

Secondly,  when  we  fit  higher  degree  Bezier  curves  to  data  the  nearest  point 
problem  becomes  nonlinear  and  in  these  cases  this  method  as  presented  is  insufficient 
to  consistently  move  the  points  r  closer  to  the  nearest  points.  The  problem  of  de¬ 
termining  the  correct  step  size  to  take  in  the  direction  of  the  gradient  requires  more 
effort  than  relying  on  a  user  defined  value  of  some  scalar  multiplying  a  •  b.  Instead  of 
attempting  a  possibly  involved  search  technique  to  overcome  this  obstacle,  a  better 
solution  technique  for  the  nearest  points  was  sought. 

D.  THE  GAUSS-NEWTON  METHOD 

For  a  given  matrix  P,  the  Gauss-Newton  method  is  a  better  way  to  solve  the 
nonlinear  least  squares  problem  for  the  nearest  points. 
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1.  Review  of  Newton’s  Method 

Under  appropriate  conditions  on  the  function  /  :  t  — »  /(t),  Newton’s  method  is 
known  to  converge  quadratically  for  initial  estimates  in  the  neighborhood  of  the  roots. 
The  idea  behind  Newton’s  method  is  that  near  a  root  we  can  model  the  behavior  of 
the  function  with  the  following: 

Mc(t)  =  f(tc)  +  f(te)(t  -  te)  (IIL5) 

where  Mc(t )  is  a  linear  approximation  of  the  function  /  and  tc  is  an  estimate  to  a  root 
of  /.  The  value  t  such  that  Mc(t)  =  0  is  an  improved  approximation.  This  improved 
approximation  is  then  used  as  the  next  estimate.  So,  we  see  that  Newton’s  method 
is  iterative  with  each  iteration,  or  step,  bringing  the  estimate  closer  to  a  root  of  /. 
This  is  also  the  idea  behind  the  Gauss-Newton  method  for  solving  the  nonlinear  least 
squares  problem. 

2.  Solving  the  Nonlinear  Least  Squares  Problem 

For  a  detailed  discussion  on  the  Gauss-Newton  method,  see  [Ref.  7].  Let 
control  point  matrix  P  €  7Z.mx2  be  known  and  that  we  want  to  solve  the  following: 


B(t)P  -D  =  0  (III. 6) 

where  0  is  an  mx2  matrix  of  zeros.  To  change  Equation  (III.6)  to  a  form  solvable 
using  the  Gauss-Newton  method,  we  need  to  uncouple  the  left  hand  side  as 


'  BZih)  B?(h)  •••  £2(U)  0  0  •••  o' 

di,x 

Bg(t2)  B?(i2)  •••  B*(t2)  0  0  •••  0 

4 

d2,x 

B% (tm)  B?(tm)  •••  B%(tm)  0  0  •••  0 

bn,x 

dm,  x 

0  0  •••  0  B%(tx)  B?(h)  •••  Bl{h) 

boiV 

d\,y 

0  0  •••  0  Bff(ta)  •••  BS(t2) 

Ky 

d2  ,y 

0  0  •••  0  Bg(tm)  B?(tm)  •••  B£(tm)  . 

bn,y  _ 

dm,y 
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Let  the  above  expression  be  the  residual  R( t).  We  see  that  R( t)  is  a  system  of  2m 
polynomials  and,  therefore,  minimizing  R( t)  is  in  some  sense  similar  to  root  solving. 

The  Gauss-Newton  method  proceeds  as  follows.  As  in  Newton’s  method,  we 
first  model  the  behavior  of  R{ t)  with 


Mc{  t)  =  R(te)  + 


(t  -  tc) 


which  is  analogous  to  Equation  (III. 5).  We  will  use  the  term  Jacobian  to  describe  the 
derivative  of  a  vector  function.  Therefore,  our  Jacobian  matrix  is 


dR(t i ,c)s  b  b  #  ,c)j 

dt\  dtm 


J(  U) 


dfi(tm}c)x  .  .  .  dR(tm,c)x 

dt\  dtm 

dR(titC)y  <  ^  #  dR(t\,c)y 

dt\  dtm 


dR(tm.c)y  m  '  #  dR(tm,c)y 
dt\  dtm 


Note  that 

- =  0  for  i  ^  j 

dtj  r 

which  also  holds  for  the  y  components.  Therefore,  our  Jacobian  matrix  has  the 
following  special  form 


J(  tc)  = 


dR(h,c)x 

dtx 

0 

0 

0 


0 

0 


rffi(U,c)y 
dt  1 

0 

0 

0 


0 

0 


0 

0  0 

0 

A  dR^im^eJx 

U  dim 

0 

0  0 

0 

A 

U  dtm 


(III-7) 
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To  get  an  improved  estimate  we  want  to  solve  for  t  so  that 


Mc{ t)  =  R( tc)  +  J(tc)(t  -  tc) 

=  0 


or,  equivalently 


J(tc)t  =  J(tc)tc  -  R( tc) 


(III.8) 


where  the  right  hand  side  is  a  known  vector  in  7i2m.  Note  that  Equation  (III.8)  is  a 
linear  least  squares  problem  for  the  vector  t.  Formally,  we  proceed  by  forming  the 
normal  equations 


Jr(tc)J(tc)t  =  JT(tc)  J(tc)tc  -  Jr(tc)i?(tc) 

where  JT(tc)J(tc)  is  a  square  matrix  and  invertible.  Though  one  could  argue  for  a 
more  numerically  stable  method  to  solve  this  least  squares  problem,  in  our  particular 
case  the  normal  equations  lead  to  a  simple  expression  for  the  change  to  make  in 
our  nodes  and  to  results  which  are  favorable  when  using  the  alternating  projection 
approach.  So,  after  multiplying  both  sides  by  (jT(tc)J(tc)j  we  have 

t  =  tc  -  (jT(tc)J(tc))  1  JT(tc)R(tc) 

Therefore,  our  improved  estimate  tc  is  given  by 

tc  =  tc  -  (jT(tc)J(tc))_1  JT(tc)R(tc)  (III.9) 

=  tc-A(tc) 


Our  matrix  J(tc)  from  Equation  (III.7)  has  a  special  form  so  that  the  right 
hand  side  of  Equation  (III.9)  simplifies.  First,  note  that  JT(tc)J(tc)  is  a  diagonal 
matrix  with  elements  on  the  main  diagonal  of  the  form 


(  dR{tj,c)x 

\  dti 


+ 


'  dR(t^c)y 

dti 


for  i  =  1, . . .  ,m 


(III.10) 
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Therefore,  (JT(tc) J(tc))  1  is  another  diagonal  matrix  with  elements  which  are  the 
reciprocals  of  those  described  by  Equation  (III.  10).  The  term  JT(tc)R(tc)  is  a  vector 
in  lZm  with  elements  of  the  form 


R(t  1  |  D(f  \  dR{ti, c)y 


(Ill'll) 


it,  '  •  "  dti 

Combining  the  results  from  Equation  (III. 10)  and  Equation  (III. 11),  we  have 
the  following  expression  for  each  element  of  A(tc), 


'dR(ti)s' 

dt; 


+ 


'dR(tj)y' 

,  dti  , 


-l 


D/a  \  dR{ti,c)x  ,  D/a  \  dR(tilC)y 
R\ti,c)x  '  tt[tt,c)y  dt. 


(III. 12) 


Equation  (III. 12)  is  easy  to  program  into  MATLAB  and  very  inexpensive. 
Note  that  tc  is  one  step  in  the  direction  of  the  nodes  which  will  minimize  R(t).  That 
is,  tc  is  one  step  in  the  direction  of  the  nearest  points. 

The  effectiveness  of  the  Gauss-Newton  method  depends  in  part  on  our  initial 
points  t  being  in  the  neighborhood  of  the  nearest  points.  Just  like  the  Newton 
method,  a  poor  initial  estimate  could  cause  the  method  to  fail.  Therefore,  we  use  the 
affine  invariant  angle  method  in  our  primary  solution  algorithm  to  get  the  initial  set 
of  nodes. 

3.  Stopping  Criteria 

Because  relative  error  is  a  more  meaningful  indicator  of  change  for  larger 
values,  the  stopping  criteria  used  in  this  paper  to  determine  is  determined 

by  the  value  of  ||  /?(tJ+1,  P 7+i )  lb-  When  the  2  norm  of  this  residual  is  greater  than 
one,  the  stopping  criteria  is  the  relative  error.  When  the  2  norm  of  the  residual  is 
less  than  or  equal  to  one,  the  stopping  criteria  is  the  absolute  error  expressed  as 


II  5 Pj+i )  —  P ,)  ||2 


where 


R(tj,Pj)  =  B(tj)Pj-D 
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is  the  residual  of  iteration  j.  Note  that  the  residual  depends  on  the  nodes  and  the 
control  points.  When  either  measure  of  error  gets  below  a  user  defined  value,  the 
function  stops  improving  the  fit  of  the  curve. 

We  could  also  use  the  absolute  error 

II  tj+l  _  II 2 

However,  a  small  change  in  our  estimates  does  not  necessarily  mean  we  are  close  to 
the  nearest  points.  It  is  the  curve  we  are  fitting  to  the  data  and  not  the  vector  t,  so 
it  is  best  to  use  the  curve  itself  to  indicate  when  we  are  close  enough  to  determining 

E.  PRIMARY  ALGORITHM 

The  following  is  the  primary  algorithm  this  paper  uses  to  determine  {P*,t*}. 
It  is  presented  to  aid  in  the  understanding  of  the  MATLAB  function  grail. m.  The 
MATLAB  functions  included  below  are  provided  in  the  Appendix. 

Step  one.  The  user  sends  gradl.m  the  data  matrix  D  €  P-mx2  in  the  sequence 
he  wants  the  data  ordered,  the  degree  Bezier  curve  to  fit  to  the  data,  and  an  exponent 
value  which  determines  the  stopping  criteria.  The  function  aff. '.angle,  m  is  called  as 
a  subroutine  and  returns  the  initial  set  of  nodes  ti  using  the  affine  invariant  angle 
method.  The  function  mxbern2.m  is  called  as  a  subroutine  and  returns  the  Bernstein 
matrix  P(ti).  Then,  the  linear  least  squares  parametric  functional  problem  P(ti) Pi  = 
D  is  solved  for  the  matrix  Pi  of  control  points.  Finally,  the  residual  P(t1?  Pi)  is 
calculated. 

Step  two.  Begin  the  while  loop.  The  stopping  criteria  is  checked,  and,  if  met, 
we  exit  the  while  loop.  Otherwise,  the  derivative  of  P(t;,  P,),  which  is  the  derivative 
of  B(t,-)P;,  is  determined  as  presented  in  Chapter  I  again  using  mxbern2.m.  We  now 
use  Equation  (III. 12)  to  obtain  the  improved  estimate,  where  tl+i  =  t,-.  Thus,  we 
take  only  one  step  in  the  direction  of  the  nearest  points.  Once  we  have  the  improved 
estimate  we  ensure  its  elements  are  within  the  interval  [0,1]. 
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Step  three.  Solve  the  least  squares  problem  for  the  matrix  P,+ 1 .  The  residual 
Pi+i )  is  calculated.  Return  to  step  two.  End  of  the  while  loop. 


30 


IV. 


METHOD  EXAMPLES 


In  this  chapter  we  will  consider  graphic  examples  of  the  different  MATLAB 
functions  used  while  researching  this  paper  to  determine  {P*,t*}.  The  difference 
between  the  functions  is  how  each  handles  the  nonlinear  least  squares  problem  for 
the  nearest  points.  The  MATLAB  functions  examined  in  this  chapter  are: 

1.  gradS. m.  Uses  the  gradient  method. 

2.  grad5.m.  Uses  the  NearestPoint  method. 

3.  gradl.m.  Uses  the  Gauss-Newton  method. 

A.  GRAPHICAL  COMPARISONS 

This  section  examines  graphical  examples  of  the  three  MATLAB  functions 
above  and  also  the  graphical  difference  between  the  two  affine  invariant  node  spacing 
methods  presented  in  Chapter  II. 

1.  Gradient  Method  and  Approaching 

As  stated  in  Chapter  III,  the  gradient  method  as  developed  for  this  paper  was 
insufficient  to  consistently  move  the  points  r  closer  to  the  nearest  points.  Figure  (8) 
and  Figure  (9)  on  page  37  show  that  the  algorithm  reaches  a  stage  where  the  points 
r  cycle  back  and  forth  along  the  curve.  The  two  curves  are  similar,  but  we  are  no 
longer  approaching  a  better  fit.  By  the  third  iteration  the  control  points  cycle  back 
and  forth  between 

0.8297  0.8911 
Pi  =  6.0947  9.5031 

5.0183  1.0117 

and 

0.7802  0.7811 
P2  =  5.9106  9.6305 

4.9913  0.9784 
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The  nodes  cycle  back  and  forth  between 


ti 


0.0000 

0.1626 

0.3566 

1.0000 


and 


t2 


0.0413 

0.1011 

0.4272 

1.0000 


Though  either  of  the  curves  seem  like  a  pretty  good  fit,  an  algorithm  that  approaches 
{.P*,t*}  is  what  any  user  expects  and,  therefore,  this  method  is  unacceptable. 


2.  Total  Least  Squares  Versus  Linear  Least  Squares 

We  might  ask  what  the  difference  is  between  a  fitted  curve  produced  by  tra¬ 
ditional  linear  least  squares  and  a  fitted  curve  produced  by  the  total  least  squares 
function  gradl.m.  Figure  (10)  and  Figure  (11)  on  page  38  are  both  cubic  curves  fitted 
to  the  same  data  set  but  using  these  two  different  approaches. 

We  notice  that  each  curve  makes  a  different  assumption  about  the  behavior 
of  the  data  about  the  point  di.  Most  important  to  the  user  is  that  the  linear  least 
squares  fit  is  only  assuming  error  in  the  y  values,  while  gradl.m  is  minimizing  error 
in  the  x  and  the  y  values.  This  is  more  practical  since  the  input,  or  the  x  values,  will 
also  often  contain  some  degree  of  error. 


3.  Effect  of  the  Initial  Set  of  Nodes 

We  will  now  see  how  the  initial  set  of  nodes  may  determine  the  fit  of  the  ap¬ 
proximating  curve.  Figure  (12)  and  Figure  (13)  on  page  39  show  the  Bezier  curves 
returned  by  gradl.m  using  two  different  affine  invariant  node  spacing  methods.  Fig¬ 
ure  (12)  reflects  the  affine  invariant  chord  method  while  Figure  (13)  reflects  the  affine 
invariant  angle  method.  Both  figures  show  the  nodes  associated  with  the  data  points. 
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Because  the  affine  invariant  angle  method  takes  into  account  the  bending  of 
the  data,  the  approximating  curve  fits  more  naturally.  This  is  opposed  to  what  we 
have  in  Figure  (12)  where  it  looks  like  the  curve  is  alternately  stretched  and  slack. 

4.  Failure  to  Maintain  Ordering  of  Data 

Figure  (14)  and  Figure  (15)  on  page  40  are  two  approximating  curves  for  the 
same  set  of  data.  Both  curves  were  generated  by  gradl.m  with  stopping  criteria  of 
10_1.  Figure  (14)  is  the  result  of  using  the  affine  invariant  chord  method  to  obtain 
the  initial  set  of  nodes  and  Figure  (15)  is  the  result  of  using  the  affine  invariant  angle 
method. 

In  the  same  figures,  notice  the  difference  in  interpolating  about  the  point  d4. 
When  the  order  of  the  data  is  not  maintained  the  user  gets  an  entirely  different 
picture  of  the  trend  in  the  data.  In  general,  whether  the  methods  presented  in  this 
paper  maintain  the  ordering  of  data  is  dependent  on  the  initial  set  of  nodes  and  the 
order  of  the  curve  the  user  fits  to  the  data.  For  example,  using  the  same  data  set  as 
in  Figure  (15)  and  a  degree  2  curve,  the  function  gradl.m ,  using  the  affine  invariant 
angle  method,  will  fail  to  maintain  the  order  of  the  data  about  d4. 

Consider  the  function  grad5.m.  Because  the  NearestPoint  method  is  free  to 
look  anywhere  along  the  curve  to  solve  the  nearest  point  problem,  grad5.m  more  often 
fails  to  maintain  the  ordering  of  data.  This  is  opposed  to  the  Gauss-Newton  method 
which  assumes  that  each  initial  point  r,-  is  in  the  neighborhood  of  the  nearest  point. 

5.  Gauss-Newton  Method  and  Data  with  Noise 

Consider  a  set  of  data  that  lies  on  a  cubic  curve  and  that  the  data  is  then 
transmitted  to  a  receiver.  In  the  transmission  of  the  data  a  small  amount  of  noise 
gets  added.  How  well  does  gradl.m  fit  the  data  with  the  noise  added,  and  how  much 
like  the  original  curve  is  the  resulting  fitted  curve?  Note,  in  the  following  examples 
all  random  sets  of  numbers  were  generated  using  the  MATLAB  rand  command  which 
returns  a  uniformly  distributed  set  of  numbers  on  the  interval  [0,  1].  Figure  (16)  on 
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page  41  shows  a  random  sampling  of  10  data  points  from  a  cubic  Bezier  curve  which 
was  generated  from  a  random  matrix  P.  Noise  was  added  to  each  data  point  by 
generating  a  random  set  of  points  around  the  unit  circle  and  scaling  each  point  by 
.012.  Figure  (17)  on  the  same  page  is  the  curve  fitted  by  grail.m. 

This  example  shows  one  of  the  limitations  of  grad7.m  even  when  using  an 
effective  initial  node  spacing  method:  grad7.m  views  the  cluster  of  data  points  in  the 
upper  right  hand  side  of  the  plot  as  just  another  set  of  points  to  fit  with  a  curve.  This 
is  why  we  get  the  sharp  bend  within  the  cluster.  The  user  who  wants  to  avoid  this 
type  of  behavior  could,  for  example,  substitute  one  data  point  for  the  entire  cluster 
or  use  a  weighted  least  squares  approach  to  fitting  the  data. 

Figure  (18)  on  page  42  is  the  same  set  of  data  but  now  fitted  with  a  degree  5 
curve.  This  plot  reflects  what  occurs  when  the  fitted  curve  has  too  much  freedom.  A 
degree  5  curve  has  more  freedom  than  needed  for  the  data,  and  grad7.m  makes  use 
of  all  the  freedom  available  in  order  to  improve  the  fit. 

6.  Gauss-Newton  and  Real  World  Data 

This  section  briefly  presents  a  real  world  problem  where  the  function  grad7.m 
could  be  used.  The  data  for  this  section  comes  from  recording  positions  along  a  road 
leading  to  Fort  Ord,  California  with  a  Global  Positioning  System  (GPS)  receiver. 
Many  similar  data  sets  are  gathered  using  several  different  receivers  along  the  same 
route  and  curves  are  then  fitted  to  the  data.  The  fitted  curves  are  used  to  determine 
the  bias  present  based  on  the  particular  satellites  being  used  by  the  receivers.  Imagine 
the  road  in  the  x-y  plane.  There  will  be  error  in  both  the  x  and  y  coordinates  of  each 
location  in  the  data  set,  so  we  will  want  to  fit  a  curve  to  the  data  in  the  total  least 
squares  sense.  Figure  (19)  on  page  43  is  the  data  and  the  degree  3  curve  fitted  to  the 
data  using  a  stopping  criteria  of  10-4. 
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B.  GAUSS-NEWTON  VERSUS  NEARESTPOINT 

We  now  look  specifically  at  the  performance  of  grad7.ni  and  grad5.m.  We  will 
consider  both  graphical  results  and  computer  time  as  indicators  of  performance.  Both 
functions  get  their  initial  set  of  nodes  using  the  affine  invariant  angle  method. 

1.  Graphical  Results 

We  will  consider  cubic  Bezier  curves  on  which  we  select  11  evenly  spaced 
points  with  respect  to  the  parameter  t  and  then  add  a  small  amount  of  random  noise, 
as  above,  to  each  point.  Each  of  the  two  functions  fits  a  cubic  Bezier  curve  to  the 
resulting  data  set  and  the  original  and  newly  generated  curve  are  displayed  on  one  plot 
to  compare  how  well  each  function  performs.  Figure  (20)  on  page  44  to  Figure  (25) 
on  page  46  are  pairs  of  plots  for  three  different  data  sets.  The  solid  line  is  the  curve 
returned  by  the  functions. 

Note  that  in  each  case  the  pair  of  plots  is  slightly  different.  Regardless,  the 
overall  graphical  comparison  of  the  two  functions,  after  conducting  many  other  ex¬ 
amples  not  shown,  is  that  they  perform  equally  well  except  when  grad5.m  fails  to 
maintain  the  ordering  of  data. 

2 ♦  Computer  Time  and  Iterations 

Table  I  on  page  36  reflects  how  much  computer  time  was  used  for  several  sizes 
of  data  sets  and  stopping  criteria  of  10-4.  Each  data  set  is  a  matrix  of  uniformly 
distributed  random  entries  and  ordered  with  respect  to  the  x  and  y  values.  Because 
the  NearestPoint  method  finds  each  new  node  separately,  it  is  part  of  a  for  loop 
within  grad5.m.  As  the  data  sets  get  larger,  the  function  slows  down  in  comparison 
to  grad7.m  where  the  form  of  the  normal  equations  in  the  Gauss-Newton  method 
remains  relatively  fast. 

C.  GENERAL  COMPARISON  OF  FUNCTIONS 

Table  II  on  page  36  reflects  the  general  results  of  researching  this  paper  by 
comparing  the  three  solution  functions  examined  in  this  chapter. 
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Size 

Time  (sec)  /  Iter 

grad7.m 

10 

3.0135  /  121 

1.7405  /  126 

100 

10.8809  /  53 

3.7208  /  91 

1000 

62.2660  /  36 

.9655  /  3 

Table  I.  Computer  Time  Used 


Function 

General  Results 

gradS.  m 

•  Inconsistently  indicates  how  large  a  step  to  take  in  the  direction 
of  the  nearest  point. 

grad5.m 

•  Faster  than  gradl.m  in  rare  cases. 

•  Graphically,  performs  equally  well  as  gradl.m. 

•  Code  is  already  provided  but  more  difficult  than  gradl.m  to 
examine  and  follow,  see  [Ref.  5]. 

•  More  readily  fails  to  maintain  the  ordering  of  data. 

•  More  expensive  to  run  on  larger  data  sets  and  many  smaller 
data  sets  as  seen  during  research. 

gradl.m 

•  Faster  than  grad5.m  in  most  cases. 

•  In  the  many  examples  conducted  during  research,  has  not  failed 
to  reach  a  reasonable  stopping  criteria. 

•  Simple  to  code  and  algorithm  is  easy  to  follow. 

Table  II.  General  Comparison  of  Functions 
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Figure  8.  Gradient  Method  Fails  to  Converge 


0  2  4  6  8 


Figure  9.  Gradient  Method  Fails  to  Converge 
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Figure  10.  Linear  Least  Squares  Fit 


Figure  11.  Gauss-Newton  Fit 
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Figure  13.  Affine  Invariant  Angle  Method 
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Figure  14.  Failure  to  Maintain  Order 


Figure  15.  Affine  Invariant  Angle  Maintains  Order 
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Figure  25.  Data  Set  Three  and  grad5.vn 


1.7 
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APPENDIX.  MATLAB  FUNCTIONS 


GRAD7.M 


function  [p,t,info]  =  grad7(d, deg, stop) 

'/«  GRAD7.M  This  function  takes  a  given  set  of  ordered  data  and  returns 
%  the  parameter  values  (nodes)  and  control  points  which  determine  the 
%  Bezier  curve  that  fits  the  data  in  the  total  least  squares  sense. 

'/.  Instead  of  minimizing  the  vertical  distance  to  the  curve  we  minimize 
y,  both  vertical  and  horizontal  distance.  The  central  feature  of  this 
%  function  is  the  use  of  the  Gauss-Newton  method  to  estimate  the 
y,  nearest  point  along  the  Bezier  curve  from  each  data  point. 


%  Input  Arguments: 

%  d 

y. 

‘/o  deg 

% 

y  stop 

y 


(i  x  2)  matrix  of  ordered  data  points, 
i=2,3, . . . 

degree  of  the  curve  the  user  wants  to  fit 
to  the  data 

stopping  criteria  number  which  in  the 
algorithm  becomes  10“ (stop) 


y  Output  Arguments: 


y  info 

y 

y 


control  points  for  best  fit  Bezier  curve 
nodes  for  best  fit  Bezier  curve 
(2  x  1)  vector  which  has  the  final  norm  of 
the  residual  and  the  number  of  iterations 
to  convergence 


y  Basic  Algorithm: 

y 

y 

y 

y 

y 

y 

y 

y 

y 

y 

y 


1.  Determine  and  plot  the  best  fit  Bezier  curve  by 
solving  a  linear  least  squares  problem  using  an 
initial  set  of  nodes  based  on  the 

’spread'  and  ’bend’  of  the  data. 

2.  Perform  the  following  until  the  stopping 
criteria  is  met : 

a.  Determine  a  new  set  of  nodes 

which  estimates  points  on  the  current  Bezier 
points. 
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X 

y,  b.  Determine  the  control  points  for  the  Bezier 

*/,  curve  which  will  best  fit  the  data  using  the 

*/,  new  set  of  nodes. 

*/,  ALGORITHM  STEP  1. 

'/,  Check  the  hold  state  so  it  can  be  returned  to  how  the  user  had  it. 

if  (ishold) 
hold_was_off  =0; 
else 

hold_was_off  =  1; 
end 

y,  ’i’  is  the  number  of  data  points  and  ’j’  is  the  number  of  control 
'/,  points  required  for  the  degree  of  the  curve  specified  by  the  user, 
y,  af f .angle (d)  returns  the  initial  set  of  nodes,  a  vector  of  ’i’ 

%  elements,  based  on  the  ’spread’  and  ’bend’  of  the  data. 

i  =  size(d, 1) ; 
j  =  deg+1 ; 
t  =  aff _angle(d)  ; 

%  ’bez.mat’  is  a  (i  x  j)  matrix  which  is  determined  by  the  nodes 
y,  and  the  degree  of  the  curve  desired.  Since  we  would 
%  like  to  fit  the  data  exactly,  we  should  solve: 

X 

'/,  d  =  bez_mat  *  p 

X 

'/,  for  the  desired  (j  x  2)  matrix  of  control  points  ’p’ .  This  is  a 
'/,  linear  least  squares  problem  since  the  nodes  are  known.  ’p’ 
y,  is  determined  by: 

X 

y,  p  =  pinv(bez_mat)  *  d 

X 

'/,  which  is  the  same  as  using  the  mat  lab  ’backslash’  command. 

bez.mat  =  mxbern2(t ,deg) ; 
p  =  bez_mat  \  d; 

'/,  Once  we  have  the  control  points  'p’  ,  we  can  determine  the  Bezier 
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7  curve  and  the  points  on  the  curve  associated  with  the  initial  vector 
7  t.  'y  ’  is  the  (i  x  2)  matrix  of  points  on  the  Bezier  curve 
7.  associated  with  the  initial  vector  t.  ’tl’  is  a  closely  spaced 
7.  set  of  parameter  values  which  will  produce  the  Bezier  matrix, 
7'bez_mat_l' ,  which  will  give  enough  points  on  the  fitted  Bezier  curve 
7  for  matlab  to  plot  a  smooth  looking  curve:  'yl'  is  the  matrix  of 
7  these  points. 

y  =  bez.mat  *  p; 

tl  =  [0:1/128:1] '; 

bez_mat_l  =  mxbern2(tl,deg) ; 

yl  =  bez_mat_l  *  p; 

7  Now  we  plot  the  results  for  the  user.  A  legend  is  provided  and  the 
7  axes  are  made  'equal'  to  eliminate  distortions. 

7  'Pause'  will  keep  the  plot  displayed  and  delay  this  function  until 
7  the  user  presses  any  key  on  the  keyboard. 

figure 

plot(yl(: ,1) ,yl( : ,2)) 
hold  on 

plot(p(: ,1) ,p(: ,2) , '*') 
plot (y ( : , 1) ,y( : ,2) , 'o' ) 
plot (d( : ,1) ,d(: ,2) , '+') 
axis( ' equal ' ) 

legend('-' , 'Fitted  Curve’ 'Control  Points',... 

'o' , 'Initial  Times’ , ’+' , 'Data  Points’) 

pause 

7  ALGORITHM  STEP  2. 

7  We  will  perform  steps  2a  and  2b  within  a  'while'  loop  with  stopping 
7  criteria  to  meet  in  order  to  end  the  loop.  Our  stopping  criteria 
7  is  based  on  the  relative  change  of  the  residual.  We  also  initialize 
7  the  iteration  counter  to  zero.  The  'tic'  command  starts  a  clock  so 
7  that  the  user  will  know  how  much  computer  time  was  required  to 
7  meet  the  stopping  criteria. 

iter  =  0; 

resid_old  =  0; 

resid_new  =  bez_mat  *  p  -  d; 
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tic 


while  norm(resid_new  -  resid_old)/max(l  ,  norm (res id_new))  >  10“ (stop) 

X  Algorithm  step  2a.  Each  iteration  of  the  'while'  loop  produces  a 
X  new  vector  t  by  solving  the  nonlinear  least  squares  problem 

X 

X  B(t)  *  p  =  d 

X 

'/,  where  'p'  is  the  vector  of  control  points,  'd'  the  matrix  of  data 
X  points,  and  'B(t)'  is  the  Bernstein  matrix  with  unknown  parameter 
%  values.  Matrix  'B(t)'  is  nonlinear  in  terms  of  the  parameter 
X  values.  The  method  used  in  this  function  is  the  Gauss-Newton  method 
X  where  we  let  the  residual  be  R(t)  =  B(t)  *  p  -  d  and  we  let  the 
X  Jacobian  matrix  'J'  be  such  that  J_i,j  =  dB(t_i)/dt_j .  I.E.,  the 

X  (i,j)~th  element  of  'J'  is  the  slope  along  the  Bezier  curve  at  the 
X  i~th  parameter  value  with  respect  to  the  j“th  parameter 
X  value.  The  Gauss-Newton  method  says  that  the  change  in  parameter 
X  values  which  will  minimize  the  residual  is  given  by 

X 

X  delta_t  =  -inv(J '  *  J)  *  J’  *  R 

X 

X  where  'J'  and  'R'  are  evaluated  at  the  current  parameter  values. 

X  To  compute  the  gradient  at  a  point  on  the  Bezier  curve,  we  need  the 
X  forward  difference  of  the  control  points.  I.E.,  we  need  a 
X  (j-1  x  2)  matrix  where  the  entries  are  p_i+l  -  p_i  for  i=l,..,j-l. 

X  The  slope,  'deriv',  is  then  determined  multiplying  the  Bernstein 
X  matrix  for  a  degree-minus-one  curve  by  the  forward  difference  matrix 
X  of  control  points  and  then  multiplying  by  the  degree  of  the  original 
X  Bezier  curve. 


deriv  =  deg  *  mxbern2(t ,deg-l)  *  (p(2:j,:)  -  p(l:j-l,:)); 


X  Now  we  have  what  we  need  for  the  Gauss-Newton  method.  Since  to  use 
X  this  method  the  residual  needs  to  be  a  vector,  we  simply  take  the 
X  y-values  of  the  residual  and  append  them  to  the  bottom  of  the 
X  x-values.  This  is  done  using  'resid(:)’.  Similarly,  the  Jacobian 
X  matrix's  elements  must  be  for  the  new  vector  'resid' .  Each  element 
X  of  the  matrix  'J'  is  the  x-value  or  y-value  of  the  slope  at  each 
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%  parameter's  point.  Since  the  slope  at  any  point  on  the  curve 
%  doesn't  charge  with  any  parameter  other  than  it's  own,  most  of  the 
•/.  entries  in  'J'  are  zero. 

’/,  't',  the  new  nodes  are  given  by  't  =  t  -  delta_t' 

%  using  the  formula  above.  Because  (J’  *  J)  and  J'  have  a  form  we 
y,  know  in  advance,  we  can  form  'delta_t'  using  less  computer  time  and 
*/,  flops. 

t  =  t  -  (deriv(: ,1) .*resid_new( : ,1)  +  deriv( : ,2) .*resid_new( : ,2))  ... 
./  [deriv( : ,1) .“2  +  deriv(: ,2) .~2]  ; 

y  Now  we  have  a  new  vector  t,  but  we  want  to  make  sure  the  values 
%  are  between  0  and  1.  In  most  cases  with  well  behaved  data,  the 
y  following  rescaling  of  the  nodes  also  results  in  the  endpoints 
'/,  being  associated  with  the  nodes  t_l=0  and  t_m=l . 

t  =  -min(t)*ones(i , 1)  +  t; 
t  =  t/max(t) ; 

y  With  ordered  nodes  we  now  want  the  new  control  points  so 
y,  that  we  can  reproduce  the  points  'tau'  on  the  curve  for  the 
y  next  iteration  of  the  'while'  loop  or  to  be  used  in  the  final  plot 
'/,  if  the  stopping  criteria  is  met.  Note  that  if  the  condition  is  met 
y  we  can  also  use  the  below  ’bez_mat'  matrix  for  the  final  plot.  We 
y  also  compute  a  new  value  for  'resid.new'  and  update  the  iteration 
y  counter. 

bez.mat  =  mxbern2 (t , deg) ; 
p  =  bez.mat  \  d; 

resid_old  =  resid_new; 
resid_new  =  bez_mat  *  p  -  d; 

iter  =  iter  +  1;  * 

y  End  'while'  loop  and  stop  computer  time  clock  to  show  user  how  long 
y  it  took  to  converge. 

end 

toe 
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V,  Now  that  we  have  the  best  fit  vector  t  and  the  associated 
7,  matrix  ’p’  of  control  points,  we  are  ready  to  plot  the  final 
'/,  solution  for  the  user.  ’y’  are  the  points  on  the  Bezier  curve 
#/,  associated  with  the  vector  t.  ’yl’  are  the  closely  spaced 
%  points  on  the  Bezier  curve  which  matlab  will  use  to  plot  a  smooth 
7,  looking  curve. 

y  =  bez_mat  *  p; 
yl  =  bez_mat_l  *  p; 

*/,  Finally,  we  clear  the  current  plot  and  plot  the  results, 
figure 

plot(yl( : ,1) ,yl ( : ,2) ) 
hold  on 

plot(p( : ,1) ,p(: ,2)  ,  ’*’) 
plot(y ( : , 1) ,y( : ,2), ’o') 
plot (d( : , 1) ,d( : ,2) , ’  +  ’) 
axis (’ equal' ) 

7,  Include  legend  on  final  plot . 

legend( ’Fitted  Curve’ , ’Control  Points’,... 

’ o ’, ’Nodes ’,’+’, ’Data  Points’) 

*/  Return  hold  state  to  however  the  user  had  it. 

if  (hold_was_off ) 
hold  off; 
end 

info  =  [norm(bez_mat*p-d) ,  iter] ; 


X  End  GRAD7.M 


MXBERN2.M 


function  [B]  =  mxbern2(t,d) 

7,  MXBERN2.M  This  function  creates  a  Bernstein  matrix  of  degree  d 
*/.  using  the  values  in  the  column  vector  t .  A  Bernstein  matrix 
V,  is  a  generalized  Vandermonde  matrix  whose  (i,j)  entry  is 
#/«  B_-Cj-l}~d(t_i)  .  The  vector  t  must  be  a  column  vector  with  values 
%  between  0  and  1.  Copyright  (c)  3  December  1994  by  Carlos  F.  Borges. 
'/,  All  rights  reserved.  Modified  by  permission  for  this  paper. 

[n  m]  =  size(t); 
if  (m  ~=  1) 

error(’t  must  be  a  column  vector.’); 
end 

%  Check  to  see  if  nodes  axe  within  [0,1]. 

if  min(t)  <  0  |  max(t)  >  1 
error (’nodes  axe  not  within  [0,1]’) 
end 

y.  Build  the  Bernstein  matrix. 

ct  =  1  -  t; 

B  =  zeros(n,d+l) ; 

for  i  =  0  :  d 

B(:,i+1)  =  (t . ~i) .*(ct . ~ (d-i) ) ; 
end 

y,  If  d  >  22  we  form  the  Bernstein  matrix  differently  to 
y,  avoid  roundoff. 

if  d  <  23 

B  =  B*diag(  [1  cumprod(d:-l:l)  ./cumprod(l:d)]  )  ,* 
else 

B  =  B*diag(diag(fliplr(pascal(d+1))))  ; 
end 

X  End  MXBERN2.M 
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AFF_AN  GLE.M 


function  [h]  =  aff_angle(X) 

AFF.ANGLE.M  This  function  returns  an  affine  invariant  vector  of 
%  nodes  for  a  given  set  of  ordered  data  X.  It  is 
'/,  assumed  that  the  user  sends  this  function  the  data  arranged  in  a 
'/,  (n  x  2)  matrix  ordered  from  row  one  to  row  n. 

'/,  Obtaining  the  covariance  matrix  A  is  from  a  function  AFF_KNT.M  which 
%  is  copyrighted  on  3  December  1994  by  Carlos  F.  Borges  and  appears 
'/,  here  with  his  approval.  The  affine  invariant  angle  method  of 
*/,  obtaining  nodes  is  found  in  a  paper  by  Thomas  A.  Foley 
'/,  and  Gregory  M.  Nielson,  "Knot  Selection  for  Parametric  Spline 
%  Interpolation",  in  the  book  "Mathematical  Methods  in  Computer  Aided 
%  Geometric  Design",  Academic  Press,  Inc.,  1989.  Notation  from  this 
'/,  paper  is  used  to  annotate  the  steps  in  this  algorithm. 

n  =  size(X, 1) ; 

Xbar  =  X  -  ones (size (X))  *  diag(mean(X) ) ; 

Xcov  =  Xbar’  *  Xbar/n; 

A  =  inv(Xcov) ; 

%  Obtain  the  node  spacing  values  using  the  metric 

y 

y  t_i  =  M[X](X_i,X_(i+l)). 

y 


V  =  X(2:n, :)  -  X(l:n-1,:); 
t  =  diag(V  *  k  *  V’)  .“  (1/2); 

'/,  Obtain  the  values  for 

y 

y  M“2[X](X_(i-l),X(i+l)) 

y 

V_2  =  X(3:n, :)  -  X(l:n-2, :); 
t_2  =  diag(V_2  *  A  *  V_ 2’); 


54 


•/.  Get  theta_i  values.  This  is  what  takes  into  account  the  bending 
V,  of  the  data. 


theta  =  zeros (n-1 , 1) ; 


for  j  =  2: n-1 


theta(j)  =  min(  pi  -  acos(  (t(j-l)“2  +  t(j)~2  -  ... 

t_2(j-l))/(2*t(j)*t(j-l))  )  ,  pi/2); 


end 

%  Obtain  the  affine  invariant  angle  node  spacing  values  h_i. 
h  =  zeros (n-1 , 1) ; 

h(l)  =  t(l)*(  1  +  (1.5*theta(2)*t(2))/(t(l)  +  t(2))  ); 

for  j  =  2:n-2 

h(j)  =  t(j)  *  (  1  +  (1.5*theta(j)*t(j-l))/ (t(j-l)+t(j))  +.  .  . 

(1 . 5*theta(j+l)*t (j+1) )/(t (j )+  t(j+l))  ); 


end 

h(n-l)  =  t(n-l)  *  (  1  +  (1.5*theta(n-l)*t(n-2))/(t(n-2)+t(n-l))  ); 

•/.  Now  that  we  have  the  node  spacing  values,  we  want  to  normalize 
'/.  them  so  that  they  are  within  [0,1],  with  the  first  data  point  being 
'/o  associated  with  the  value  zero  and  the  last  data  point  with  the 
7,  value  one. 

h  =  [0 ;  h]  ; 

h  =  cumsum(h) ; 

h  =  h  /  h(n) ; 

*/.  End  AFF.ANGLE.M 
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